Methods for Partial Sentence Recognition and Unknown Words Detection by Sentence Spotting on Continuous Speech

Authors

  • Yoshiaki Itoh
  • Jiro Kiyama
  • Ryuichi Oka
Abstract

Spontaneous speech includes many sentences that fall outside the task domain. Furthermore, sentence boundaries are often unclear in spontaneous speech because of corrections, stammering, or overlap with the next utterance. We previously developed a sentence spotting system that uses Vector Continuous Dynamic Programming (VCDP). This system works well for sentence spotting in spontaneous speech [1] because it does not need to consider sentence boundaries or utterances that fall outside the task domain. The previous system supported only “complete sentence” utterances. However, partial sentences, which consist of parts of complete sentences and are intended to convey almost the same meaning, often appear in spontaneous speech. We must be able to deal with such expressions to enable flexible recognition, even though partial sentences vary widely. We propose an extension of the sentence spotting algorithm that efficiently accepts partial sentences [2]. Processing unknown words is one of the most important factors in dealing with spontaneous speech, because an utterance will often include words that are unknown to the system. We also propose an unknown word detection algorithm within the sentence spotting framework. The extended sentence spotting algorithm is now capable of accepting partial sentences and detecting unknown words [3].

Similar resources

Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting

Islamic Republic of Iran Broadcasting (IRIB), as one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, the IRIB's archive is one of the richest archives in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data, and browsing and retrieval of this archive, is one of the key issues for this broadcasting...

A Study of Speech Intelligibility in Children with Spastic Cerebral Palsy Aged 8 to 12 Years

Background and purpose: Speech intelligibility refers to how well speech is understood by listeners. This study examined speech intelligibility in children (Persian native speakers) with spastic cerebral palsy aged 8-12 years. Materials and methods: A cross-sectional study was performed in 31 dysarthric students (….. boys and ….. girls) in Tehran, 2014. A list of w...

Iranian EFL Learners’ Lexical Inferencing Strategies at Both Text and Sentence levels

Lexical inferencing is one of the most important strategies in vocabulary learning and it plays an important role in dealing with unknown words in a text. In this regard, the aim of this study was to determine the lexical inferencing strategies used by Iranian EFL learners when they encounter unknown words at both text and sentence levels. To this end, forty lower intermediate students were div...

Keyword spotting in auto-attendant system

In this paper, an auto-attendant system using finite state grammar (FSG) based on a continuous speech recognition (CSR) model is introduced. By using two virtual garbage models, one to match the leading extraneous speech before the key name and the other to match the trailing extraneous speech following the key name, we managed to reach a more flexible and robust auto-attendant syste...

Recovery from false rejection using statistical partial pattern trees for sentence verification

In conversational speech recognition, recognizers are generally equipped with a keyword spotting capability to accommodate a variety of speaking styles. In addition, language model incorporation generally improves the recognition performance. In conversational speech keyword spotting, there are two types of errors, false alarm and false rejection. These two types of errors are not modeled in la...

Publication date: 1994